OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Arabic character recognition using fourier descriptors and character contour encoding

Internal identifier: 002D40 (Main/Exploration); previous: 002D39; next: 002D41

Authors: Sabri A. Mahmoud [Saudi Arabia]

Source: Pattern Recognition, vol. 27, no. 6, pp. 815-824

RBID : ISTEX:5CFA95A6CE082E7A147A16B2B6D5C27C27681127

French descriptors

English descriptors

Abstract

Normalized Fourier descriptors are known to be invariant to scale, translation, and rotation. This technique has been used by researchers in Latin OCR with acceptable results, and contour analysis has been used successfully in object recognition. Both techniques are adopted here because their combination is needed to recognize Arabic characters at acceptable rates: the Arabic alphabet contains some characters that are very similar to one another. The character images are first smoothed by a statistically based algorithm to eliminate noise. Then the contours of the image (namely the contours of the character's primary part, the dots, and the holes) are extracted. Fourier descriptors and curvature features of the primary part of the character are computed. The features computed from the training set serve as the model features. The features of an input character are compared to the models' features using a distance measure, and the model with the minimum distance is taken as the class representing the character. The dots' and holes' features are then used to identify the particular character within that class. Experimental results show that the combination of Fourier descriptors, curvature features, and dots' and holes' features is powerful in classifying Arabic characters. Recognition rates of 100% were achieved for the model classes. However, the rate drops to 98% in the post-recognition phase of identifying the specific characters. Most of these errors come from corrupted data.
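
The core idea of the abstract (normalized Fourier descriptors of a contour, followed by minimum-distance matching against model features) can be illustrated with a minimal sketch. This is not the author's implementation: the function names `fourier_descriptors` and `classify`, the number of coefficients, the normalization by the first harmonic, and the choice of Euclidean distance are all assumptions made for illustration. The sketch also assumes the contour is sampled counter-clockwise, and it leaves out the smoothing step, the curvature features, and the dot and hole stage.

```python
import cmath
import math


def fourier_descriptors(contour, n_coeffs=8):
    """Return descriptors invariant to scale, translation, rotation and start point.

    contour: list of (x, y) points sampled in order along a closed boundary,
    assumed (for this sketch) to be traced counter-clockwise.
    """
    n = len(contour)
    z = [complex(x, y) for x, y in contour]

    def dft(k):
        # Plain O(n) evaluation of one DFT coefficient; enough for a sketch.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

    # Coefficient 0 holds the centroid, so skipping it removes translation.
    # Dividing by |F(1)| removes scale; taking magnitudes removes the phase
    # introduced by rotation and by the choice of starting point.
    scale = abs(dft(1))
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    ks = list(range(2, n_coeffs + 2)) + [-k for k in range(1, n_coeffs + 1)]
    return [abs(dft(k)) / scale for k in ks]


def classify(features, models):
    """Return the label of the model closest to `features`.

    models: dict mapping a class label to its descriptor vector.
    Euclidean distance is an assumption; the abstract only says
    "a distance measure".
    """
    return min(models, key=lambda label: math.dist(features, models[label]))
```

Because only magnitudes of the coefficients are kept, two contours that differ by a similarity transform map to the same feature vector, which is what makes a simple nearest-model rule workable for the class-level decision described above.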

Url:
DOI: 10.1016/0031-3203(94)90166-X


Affiliations:


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Arabic character recognition using fourier descriptors and character contour encoding</title>
<author>
<name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:5CFA95A6CE082E7A147A16B2B6D5C27C27681127</idno>
<date when="1994" year="1994">1994</date>
<idno type="doi">10.1016/0031-3203(94)90166-X</idno>
<idno type="url">https://api.istex.fr/document/5CFA95A6CE082E7A147A16B2B6D5C27C27681127/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000209</idno>
<idno type="wicri:Area/Istex/Curation">000206</idno>
<idno type="wicri:Area/Istex/Checkpoint">002035</idno>
<idno type="wicri:doubleKey">0031-3203:1994:Mahmoud S:arabic:character:recognition</idno>
<idno type="wicri:Area/Main/Merge">002F07</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:94-0571991</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000A99</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000901</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000A73</idno>
<idno type="wicri:doubleKey">0031-3203:1994:Mahmoud S:arabic:character:recognition</idno>
<idno type="wicri:Area/Main/Merge">003067</idno>
<idno type="wicri:Area/Main/Curation">002D40</idno>
<idno type="wicri:Area/Main/Exploration">002D40</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Arabic character recognition using fourier descriptors and character contour encoding</title>
<author>
<name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Arabie saoudite</country>
<wicri:regionArea>Computer Engineering Department, College of Computers and Information Sciences, King Saud University, P.O. Box 51405, Riyadh 11543</wicri:regionArea>
<wicri:noRegion>Riyadh 11543</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1993">1993</date>
<biblScope unit="volume">27</biblScope>
<biblScope unit="issue">6</biblScope>
<biblScope unit="page" from="815">815</biblScope>
<biblScope unit="page" to="824">824</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">5CFA95A6CE082E7A147A16B2B6D5C27C27681127</idno>
<idno type="DOI">10.1016/0031-3203(94)90166-X</idno>
<idno type="PII">0031-3203(94)90166-X</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic character recognition</term>
<term>Character recognition</term>
<term>Contour analysis</term>
<term>Curvature feature</term>
<term>Direction feature</term>
<term>Fourier descriptor</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Analyse contour</term>
<term>Caractéristique courbure</term>
<term>Caractéristique direction</term>
<term>Descripteur Fourier</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance caractère arabe</term>
<term>Reconnaissance forme</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Normalized Fourier descriptors are known to be invariant to scale, translation, and rotation. This technique was used by researchers of Latin OCR yielding acceptable results. In addition, contour analysis was used in object recognition with success. Both techniques are adopted as they are necessary for the recognition of Arabic characters with acceptable recognition rates. This combination was deemed necessary due to the special characteristics of Arabic characters that have some very similar characters. The character images are smoothed by a statistically-based algorithm to eliminate noise. Then, the contours of the image (namely the character primary part, the dots, and hole contours) are extracted. Fourier descriptors and curvature features of the primary part of the character are computed. These features of the training set are used as the model features. The features of an input character are compared to the models' features using a distance measure. The model with the minimum distance is taken as the class representing the character. The dots' and holes' features are then used to specify the particular character. Experimental results have shown that the combination of the Fourier descriptors, the curvature features and the use of dots' and holes' features to be powerful in successfully classifying Arabic characters. Recognition rates of 100% were achieved for the model classes. However, this rate has come down to 98% in the post-recognition phase of identifying the specific characters. The major part of these errors come from corrupted data.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Arabie saoudite</li>
</country>
</list>
<tree>
<country name="Arabie saoudite">
<noRegion>
<name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002D40 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002D40 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:5CFA95A6CE082E7A147A16B2B6D5C27C27681127
   |texte=   Arabic character recognition using fourier descriptors and character contour encoding
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024